A NOVEL β1→6 N-ACETYLGLUCOSAMINYLTRANSFERASE,
ITS ACCEPTOR MOLECULE, LEUKOSIALIN, AND
A METHOD FOR CLONING PROTEINS HAVING ENZYMATIC ACTIVITY

This work was supported by grants CA33000 and
CA33895 awarded by the National Cancer Institute. The
United States Government has certain rights in this
invention.

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION

This invention relates generally to the fields of
biochemistry and molecular biology and more specifically to
a novel human enzyme, UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to
GalNAc) β1→6 N-acetylglucosaminyltransferase (core 2 β1→6
N-acetylglucosaminyltransferase; C2GnT), and to a novel
acceptor molecule, leukosialin, CD43, for core 2 β1→6
N-acetylglucosaminyltransferase action. The invention
additionally relates to DNA sequences encoding core 2 β1→6
N-acetylglucosaminyltransferase and leukosialin, to vectors
containing a C2GnT DNA sequence or a leukosialin DNA
sequence, to recombinant host cells transformed with such
vectors and to a method of transient expression cloning in
CHO cells for identifying and isolating DNA sequences
encoding specific proteins, using CHO cells expressing a
suitable acceptor molecule.

BACKGROUND INFORMATION

Most O-glycosidic oligosaccharides in mammalian
glycoproteins are linked via N-acetylgalactosamine to the
hydroxyl groups of serine or threonine. These O-glycans
can be classified into 4 different groups depending on the
nature of the core portion of the oligosaccharides (see
Fig. 1). Although less well studied than N-glycans,
O-glycans likely have important biological functions.
Indeed, the presence of O-linked oligosaccharides with the
core 2 branch, Galβ1→3(GlcNAcβ1→6)GalNAc, has been
demonstrated in many biological processes.

Piller et al., J. Biol. Chem. 263:15146-15150
(1988) reported that human T-cell activation is associated
with the conversion of core 1-based tetrasaccharides to
core 2-based hexasaccharides on leukosialin, a major
sialoglycoprotein present on human T lymphocytes (see also
Fig. 1). A similar increase in hexasaccharides was
observed in peripheral blood lymphocytes of patients
suffering from T-cell leukemias (Saitoh et al., Blood
77:1491-1499 (1991)), myelogenous leukemias (Brockhausen et
al., Cancer Res. 51:1257-1263 (1991)) and immunodeficiency
due to AIDS and the Wiskott-Aldrich syndrome (Piller et
al., J. Exp. Med. 173:1501-1510 (1991)). In these
patients' lymphocytes, changes in the amount of
hexasaccharides were caused by increased activity of either
UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc) 6-β-D-N-
acetylglucosaminyltransferase (EC 2.4.1.102) or core 2 β1→6
N-acetylglucosaminyltransferase (Williams et al., J. Biol.
Chem. 255:11253-11261 (1980)). Increased activity of core
2 β1→6 N-acetylglucosaminyltransferase also was observed in
metastatic murine tumor cell lines as compared to their
parental, non-metastatic counterparts (Yousefi et al.,
J. Biol. Chem. 266:1772-1782 (1991)).

Increased complexity of the attached 
oligosaccharides increases the molecular weight of the 
glycoprotein. For example, leukosialin containing
hexasaccharides has a molecular weight of ~135 kDa, whereas
leukosialin containing tetrasaccharides has a molecular
weight of ~105 kDa (Carlsson et al., J. Biol. Chem.
261:12779-12786 and 12787-12795 (1986)).

Fox et al., J. Immunol. 131:762-767 (1983) raised
a monoclonal antibody, T305, against human T-lymphocytic
leukemia cells. Sportsman et al., J. Immunol. 135:158-164
(1985) reported T305 binding was abolished by neuraminidase
treatment, suggesting T305 binds to hexasaccharides. T305
specifically reacts with the high molecular weight form of
leukosialin (Saitoh et al., supra (1991)).

Previous studies indicated poly-N-
acetyllactosamine repeats extend almost exclusively from
the branch formed by the core 2 β1→6 N-
acetylglucosaminyltransferase (Fukuda et al., J. Biol.
Chem. 261:12796-12806 (1986)). Consistent with these
results, Yousefi et al., supra (1991) demonstrated that
the core 2 enzyme in metastatic tumor cells regulates the
level of poly-N-acetyllactosamine synthesis in O-linked
oligosaccharides.

Poly-N-acetyllactosamines are subject to a
variety of modifications, including the formation of the
sialyl Le^x, NeuNAcα2→3Galβ1→4(Fucα1→3)GlcNAc-, or the sialyl
Le^a, NeuNAcα2→3Galβ1→3(Fucα1→4)GlcNAc-, determinants
(Fukuda, Biochim. Biophys. Acta 780:119-150 (1985)). Such
modifications are significant because these determinants,
which are present on neutrophils and monocytes, serve as
ligands for E- and P-selectin present on endothelial cells
and platelets, respectively (see, for example, Larsen et
al., Cell 63:467-474 (1990)).

In addition, tumor cells often express a
significant amount of sialyl Le^x and/or sialyl Le^a on their
cell surfaces. The interaction between E-selectin or P-
selectin and these cell surface carbohydrates may play a
role in tumor cell adhesion to endothelium during the
metastatic process (Walz et al., supra (1990)). Kojima et
al., Biochem. Biophys. Res. Commun. 182:1288-1295 (1992)
reported that selectin-dependent tumor cell adhesion to
endothelial cells was abolished by blocking O-glycan
synthesis. Complex sulfated O-glycans also may serve as
ligands for the lymphocyte homing receptor, L-selectin
(Imai et al., J. Cell Biol. 113:1213-1221 (1991)).

These reported observations establish core 2 β1→6
N-acetylglucosaminyltransferase as a critical enzyme in
O-glycan biosynthesis. The availability of core 2 β1→6
N-acetylglucosaminyltransferase will allow the in vivo and in
vitro production of specific glycoproteins having core 2
oligosaccharides and subsequent study of the effect of these
variant O-glycans on cell-cell interactions. For example,
core 2 β1→6 N-acetylglucosaminyltransferase is a useful
marker for transformed or cancerous cells. An understanding
of the role of core 2 β1→6 N-acetylglucosaminyltransferase in
transformed and cancerous cells may elucidate a mechanism
for the aberrant cell-cell interactions observed in these
cells. In order to understand the control of expression of
these oligosaccharides and their function, isolation of a
cDNA clone for core 2 β1→6 N-acetylglucosaminyltransferase
is a prerequisite. However, the DNA sequence encoding core
2 β1→6 N-acetylglucosaminyltransferase has not yet been
reported.



Thus, a need exists for identifying the core 2
β1→6 N-acetylglucosaminyltransferase and the DNA sequences
encoding this enzyme. The present invention satisfies this
need and provides related advantages as well.

SUMMARY OF THE INVENTION 

The present invention generally relates to a
novel purified human β1→6 N-acetylglucosaminyltransferase.
A cDNA sequence encoding a 428 amino acid protein having
β1→6 N-acetylglucosaminyltransferase activity also is
provided. The purified human β1→6 N-
acetylglucosaminyltransferase, or an active fragment
thereof, catalyzes the formation of critical branches in
O-glycans.

The invention further relates to a novel purified
acceptor molecule, leukosialin, CD43, for core 2 β1→6 N-
acetylglucosaminyltransferase activity. The leukosialin
cDNA encodes a novel variant leukosialin, which is created
by alternative splicing of the genomic leukosialin DNA
sequence.

Isolated nucleic acids encoding either core 2
β1→6 N-acetylglucosaminyltransferase or leukosialin are
disclosed, as are vectors containing the nucleic acids and
recombinant host cells transformed with such vectors. The
invention further provides methods of detecting such
nucleic acids by contacting a sample with a nucleic acid
probe having a nucleotide sequence capable of hybridizing
with the isolated nucleic acids of the present invention.
The core 2 β1→6 N-acetylglucosaminyltransferase and
leukosialin amino acid and nucleic acid sequences disclosed
herein can be purified from human cells or produced using
well known methods of recombinant DNA technology.

The invention also discloses a method of
isolating nucleic acid sequences encoding proteins that
have an enzymatic activity. Such a nucleic acid sequence
is obtained by transfecting the nucleic acid, which is
contained within a vector having a polyoma virus
replication origin, into a Chinese hamster ovary (CHO) cell
line simultaneously expressing polyoma virus large T
antigen and the acceptor molecule for the protein having an
enzymatic activity.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structures and biosynthesis
of O-glycans. Structures of O-glycan cores can be
classified into 4 groups (core 1 to core 4), each of which
is synthesized starting with GalNAcα1→Ser/Thr. The core 1
structure is synthesized by the addition of a β1→3 Gal
residue to the GalNAc residue. The core 1 structure can be
converted to core 2 by the addition of a β1→6 N-
acetylglucosaminyl residue. This intermediate is usually
converted to the hexasaccharide by sequential addition of
galactose and sialic acid residues (bottom right). The
core 2 β1→6 N-acetylglucosaminyltransferase and the linkage
formed by the enzyme are indicated by a box. In certain
cell types, the core 2 structure can be extended by the
addition of N-acetyllactosamine (Galβ1→4GlcNAcβ1→3) repeats
to form poly-N-acetyllactosamine. In the absence of core
2 β1→6 N-acetylglucosaminyltransferase, core 1 is converted
to the monosialoform, then to the disialoform by sequential
addition of α2→3- and α2→6-linked sialic acid residues
(bottom left). Alternatively, core 3 can be synthesized by
the addition of a β1→3 N-acetylglucosaminyl residue to the
GalNAc residue. Core 3 can be converted to core 4 by
another β1→6 N-acetylglucosaminyltransferase (top of
figure).

Figure 2 depicts genomic DNA sequence (SEQ. ID.
NO. 1) and cDNA sequence (SEQ. ID. NO. 2) of leukosialin.
The genomic sequence is numbered relative to the
transcriptional start site. Exon 1 and exon 2 have been
previously described. Exon 1' is newly identified here.
In the isolated cDNA, exon 1' is immediately followed by
the exon 2 sequence. Deduced amino acids are presented
under the coding sequence, which begins in exon 2 (SEQ. ID.
NO. 3). A portion of the exon 2 sequence is shown.

Figure 3 establishes the ability of pGT/hCG to
replicate in CHO cell lines expressing polyoma large T
antigen and leukosialin. In panel A, six clonal CHO cell
lines were examined for replication of pcDNAI-based pGT/hCG
(lanes 1-6). In panel B, replication in cell clone 5 (CHO-
Py-leu) was further examined by treatment with increasing
concentrations of DpnI and XhoI (lanes 2 and 3). Plasmid
DNA isolated from MOP-8 cells was used as a control (lane
1). Plasmid DNA was extracted using the Hirt procedure and
samples were digested with XhoI and DpnI. In parallel,
pGT/hCG plasmid purified from E. coli MC1061/P3 was
digested with XhoI and DpnI (lane 7 in panel A and lane 4
in panel B) or XhoI alone (lane 8 in panel A and lane 5 in
panel B). The arrow indicates the migration of plasmid DNA
resistant to DpnI digestion. The arrowheads indicate
plasmid DNA digested by DpnI.

Figure 4 shows the expression of T305 antigen
directed by pcDNAI-C2GnT. Subconfluent CHO-Py-leu cells
were transfected with pcDNAI-C2GnT (panels A and B) or
mock-transfected with pcDNAI (panels C and D). Sixty-four
hours after transfection, the cells were fixed, then
incubated with mouse T305 monoclonal antibody followed by
fluorescein isocyanate-conjugated sheep anti-mouse IgG
(panels A, B and C). Two different areas are shown in
panels A and B. Panel D shows a phase micrograph of the
same field shown in panel C. Bar = 20 μm.

Figure 5 depicts the cDNA sequence (SEQ. ID. NO.
4) and translated amino acid sequences (SEQ. ID. NO. 5) of
core 2 β1→6 N-acetylglucosaminyltransferase. The open
reading frame and full-length nucleotide sequence of C2GnT
are shown. The signal/membrane-anchoring domain is doubly
underlined. The polyadenylation signal is boxed.
Potential N-glycosylation sites are marked with asterisks.
The sequences are numbered relative to the translation
start site.

Figure 6 shows the expression of core 2 β1→6 N-
acetylglucosaminyltransferase mRNA in various cell types.
Poly(A)+ RNA (11 μg) from CHO-Py-leu cells (lane 1), HL-60
promyelocytes (lane 2), K562 erythrocytic cells (lane 3),
and SP and L4 colonic carcinoma cells (lanes 4 and 5) was
resolved by electrophoresis. RNA was transferred to a
nylon membrane and hybridized with a radiolabeled fragment
of pPROTA-C2GnT. Migration of RNA size markers is
indicated.

Figure 7 illustrates the construction of the
vector encoding the protein A-C2GnT fusion protein. The
cDNA sequence corresponding to Pro="' to His"" was fused in
frame with the IgG binding domain of S. aureus protein A
(bottom; SEQ. ID. NO. 6). The sequence includes the
cleavable signal peptide, which allows secretion of the
fused protein. The coding sequence is under control of the
SV40 promoter. The remainder of the vector sequence shown
was derived from rabbit β-globin gene sequences, including
an intervening sequence (IVS) and a polyadenylation signal
(An).

DETAILED DESCRIPTION OF THE INVENTION



The present invention generally relates to a
novel human core 2 β1→6 N-acetylglucosaminyltransferase.
The invention further relates to a novel method of
transient expression cloning in CHO cells that was used to
isolate the cDNA sequence encoding human core 2 β1→6 N-
acetylglucosaminyltransferase (C2GnT). The invention also
relates to a novel human leukosialin, which is an acceptor
molecule for core 2 β1→6 N-acetylglucosaminyltransferase
activity.



Cells generally contain extremely low amounts of
glycosyltransferases. As a result, cDNA cloning based on
screening using an antibody or a probe based on the
glycosyltransferase amino acid sequence has met with
limited success. However, isolation of cDNAs encoding
various glycosyltransferases can be achieved by transient
expression of cDNA in recipient cells.

Successful application of the transient
expression cloning method to isolate a cDNA sequence
encoding a glycosyltransferase requires an appropriate
recipient cell line. Ideal recipient cells should not
express the glycosyltransferase of interest. As a result,
the recipient cells would normally lack the oligosaccharide
structure formed by such a glycosyltransferase.

Expression of the cloned glycosyltransferase cDNA
in the recipient cell line should result in formation of
the specific oligosaccharide structure. The resultant
oligosaccharide can be identified using a specific antibody
or lectin that recognizes the structure. The recipient
cell line also must support replication of an appropriate
plasmid vector.

COS-1 cells initially appear to satisfy the
requirements for using the transient expression method.
COS-1 cells express SV40 large T antigen and support the
replication of plasmid vectors harboring an SV40 replication
origin (Gluzman et al., Cell 23:175-182 (1981)). Although
COS-1 cells, themselves, express a variety of
glycosyltransferases, COS-1 cells have been used to clone
cDNA sequences encoding human blood group Lewis α1→3/4
fucosyltransferase and murine α1→3 galactosyltransferase
(Kukowska-Latallo et al., Genes and Devel. 4:1288-1303
(1990); Larsen et al., Proc. Natl. Acad. Sci. USA 86:8227-
8231 (1989)). Also, Goelz et al., Cell 63:175-182 (1990),
utilized an antibody that inhibits E-selectin mediated
adhesion to isolate a cDNA sequence encoding α1→3
fucosyltransferase.

An attempt was made to use COS-1 cells to isolate
cDNA clones encoding core 2 β1→6 N-
acetylglucosaminyltransferase. COS-1 cells were
transfected using cDNA obtained from activated human T
cells, which express the core 2 β1→6 N-
acetylglucosaminyltransferase. Transfected cells suspected
of expressing core 2 β1→6 N-acetylglucosaminyltransferase
were identified by the presence of
increased levels of the core 2 oligosaccharide structure
formed by core 2 β1→6 N-acetylglucosaminyltransferase
activity. The presence of the core 2 structure was
identified using the monoclonal antibody, T305, which
identifies a hexasaccharide on leukosialin. A clone
expressing high levels of the T305 antigen was isolated and
sequenced.

Surprisingly, transfection using COS-1 cells
resulted in the isolation of a cDNA clone encoding a novel
variant of human leukosialin, which is the acceptor
molecule for core 2 β1→6 N-acetylglucosaminyltransferase
activity. Examination of the cDNA sequence of the newly
isolated leukosialin revealed the cDNA sequence was formed
as a result of alternative splicing of exons in the genomic
leukosialin DNA sequence. Specifically, the newly isolated
leukosialin is encoded by cDNA sequence containing a
previously undescribed non-coding exon at the 5'-terminus
(exon 1' in Figure 2; SEQ. ID. NO. 1 and SEQ. ID. NO. 2).

The unexpected result obtained using COS-1 cells
led to the development of a new transfection system to
isolate a cDNA sequence encoding core 2 β1→6 N-
acetylglucosaminyltransferase. CHO cells, which do not
normally express the T305 antigen, were transfected with
DNA sequences encoding human leukosialin and the polyoma
virus large T antigen. A cell line, designated CHO-Py-leu,
which expresses human leukosialin and polyoma virus large
T antigen, was isolated.

CHO-Py-leu cells were used for transient
expression cloning of a cDNA sequence encoding core 2 β1→6
N-acetylglucosaminyltransferase. CHO-Py-leu cells were
transfected with cDNA obtained from human HL-60
promyelocytes. A plasmid, pcDNAI-C2GnT, which directed
expression of the T305 antigen, was isolated and the cDNA
insert was sequenced (see Figure 5; SEQ. ID. NO. 4). The
2105 base pair cDNA sequence encodes a putative 428 amino
acid protein. The genomic DNA sequence encoding the enzyme
can be isolated using methods well known to those skilled in
the art, such as nucleic acid hybridization using the core 2
β1→6 N-acetylglucosaminyltransferase cDNA disclosed herein
to screen, for example, a genomic library prepared from HL-
60 promyelocytes.



An enzyme similar to the disclosed human core 2
β1→6 N-acetylglucosaminyltransferase has been purified from
bovine tracheal epithelium (Ropp et al., J. Biol. Chem.
266:23863-23871 (1991), which is incorporated herein by
reference). The apparent molecular weight of the bovine
enzyme is ~69 kDa. In comparison, the predicted molecular
weight of the polypeptide portion of core 2 β1→6 N-
acetylglucosaminyltransferase is ~50 kDa. The deduced amino
acid sequence of core 2 β1→6 N-
acetylglucosaminyltransferase reveals two to three
potential N-glycosylation sites, suggesting N-glycosylation
and O-glycosylation, or other post-translational
modification, could account for the larger apparent size of
the bovine enzyme.
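
The size comparison above can be checked with a rough calculation. The following is a minimal sketch and is not part of the disclosed methods: it assumes an average residue mass of about 110 Da (a common rule of thumb) and scans for the consensus N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline. The function names and the toy peptide are hypothetical; the peptide is not the C2GnT sequence of SEQ. ID. NO. 5.

```python
import re

# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def approx_polypeptide_mass_kda(n_residues):
    """Rough polypeptide mass in kDa estimated from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def find_n_glycosylation_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    # A lookahead is used so that overlapping sequons are all reported.
    pattern = r"N(?=[^P][ST])"
    return [m.start() + 1 for m in re.finditer(pattern, protein.upper())]


# Hypothetical toy peptide, for illustration only.
toy_peptide = "MKNVSAANPTLLNGTQ"

print(approx_polypeptide_mass_kda(428))           # estimate for a 428-residue protein
print(find_n_glycosylation_sequons(toy_peptide))  # Asn positions of candidate sites
```

Under this assumption a 428 amino acid polypeptide comes to roughly 47 kDa, in line with the predicted ~50 kDa stated above. Glycosylation at the candidate sites would add to this value, although the estimate says nothing about whether a given site is actually used.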



Expression of the cloned C2GnT sequence, or a
fragment thereof, directed formation of the specific
O-glycan core 2 oligosaccharide structure. Although several
cDNA sequences encoding glycosyltransferases have been
isolated (Paulson and Colley, J. Biol. Chem. 264:17615-
17618 (1989); Schachter, Curr. Opin. Struct. Biol. 1:755-
765 (1991), which are incorporated herein by reference),
C2GnT is the first reported cDNA sequence encoding an
enzyme involved exclusively in O-glycan synthesis.



In O-glycans, β1→6 N-acetylglucosaminyl linkages
may occur in both core 2, Galβ1→3(GlcNAcβ1→6)GalNAc, and
core 4, GlcNAcβ1→3(GlcNAcβ1→6)GalNAc, structures
(Brockhausen et al., Biochemistry 24:1866-1874 (1985),
which is incorporated herein by reference). In addition,
β1→6 N-acetylglucosaminyl linkages occur in the side chains
of poly-N-acetyllactosamine, forming the I-structure
(Piller et al., J. Biol. Chem. 259:13385-13390 (1984),
which is incorporated herein by reference), and in the side
chain attached to α-mannose of the N-glycan core structure,
forming a tetraantennary saccharide (Cummings et al., J.
Biol. Chem. 257:13421-13427 (1982), which is incorporated
herein by reference). The enzymes responsible for these
linkages all share the unique property that Mn2+ is not
required for their activity.

Although it was originally suggested that these
β1→6 N-acetylglucosaminyl linkages were formed by the same
enzyme (Piller et al., 1984), the present disclosure
clearly demonstrates that the HL-60-derived core 2 β1→6 N-
acetylglucosaminyltransferase is specific for the formation
only of O-glycan core 2. This result is consistent with a
recent report demonstrating that myeloid cell lysates
contain the enzymatic activity associated with core 2, but
not core 4, formation (Brockhausen et al., supra (1991)).



Analysis of mRNA isolated from colonic cancer
cells indicated core 2 β1→6 N-acetylglucosaminyltransferase
is expressed in these cells. Recent studies using affinity
absorption suggested at least two different β1→6 N-
acetylglucosaminyltransferases were present in tracheal
epithelium (Ropp et al., supra (1991)). One of these
transferases formed core 2, core 4, and I structures.
Thus, at least one other β1→6 N-
acetylglucosaminyltransferase present in epithelial cells
can form core 2, core 4 and I structures. Similarly, an
N-acetylglucosaminyltransferase present in Novikoff
hepatoma cells can form both core 2 and I structures
(Koenderman et al., Eur. J. Biochem. 166:199-208 (1987),
which is incorporated herein by reference).



The acceptor molecule specificity of core 2 β1→6
N-acetylglucosaminyltransferase is different from the
specificity of the enzymes present in tracheal epithelium
and Novikoff hepatoma cells. Thus, a family of β1→6 N-
acetylglucosaminyltransferases can exist, the members of
which differ in acceptor specificity but are capable of
forming the same linkage. Members of this family are
isolated from cells expressing β1→6 N-
acetylglucosaminyltransferase activity, using, for example,
nucleic acid hybridization assays and studies of acceptor
molecule specificity. Such a family was reported for the
α1→3 fucosyltransferases (Weston et al., J. Biol. Chem.
267:4152-4160 (1992), which is incorporated herein by
reference).

The formation of the core 2 structure is critical
to cell structure and function. For example, the core 2
structure is essential for elongation of poly-N-
acetyllactosamine and for formation of sialyl Le^x or sialyl
Le^a structures. Furthermore, the biosynthesis of cartilage
keratan sulfate may be initiated by the core 2 β1→6 N-
acetylglucosaminyltransferase, since the keratan sulfate
chain is extended from a branch present in core 2 structure
in the same way as poly-N-acetyllactosamine (Dickenson et
al., Biochem. J. 269:55-59 (1990), which is incorporated
herein by reference). Keratan sulfate is absent in wild-
type CHO cells, which do not express the core 2 β1→6 N-
acetylglucosaminyltransferase (Esko et al., J. Biol. Chem.
261:15725-15733 (1986), which is incorporated herein by
reference). These structures are believed to be important
for cellular recognition and matrix formation. The
availability of the cDNA clone encoding the core 2 β1→6 N-
acetylglucosaminyltransferase will aid in understanding how
the various carbohydrate structures are formed during
differentiation and malignancy. Manipulation of the
expression of the various carbohydrate structures by gene
transfer and gene inactivation methods will help elucidate
the various functions of these structures.

The present invention is directed to a method for
transient expression cloning in CHO cells of cDNA sequences
encoding proteins having enzymatic activity. Isolation of
human core 2 β1→6 N-acetylglucosaminyltransferase is
provided as an example of the disclosed method. However,
the method can be used to obtain cDNA sequences encoding
other proteins having enzymatic activity.



For example, lectins and antibodies reactive with
other specific oligosaccharide structures are available and
can be used to screen for glycosyltransferase activity.
Also, CHO cell lines that have defects in glycosylation
have been isolated. These cell lines can be used to study
the activity of the corresponding glycosyltransferase
(Stanley, Ann. Rev. Genet. 18:525-552 (1984), which is
incorporated herein by reference). CHO cell lines also
have been selected for various defects in cellular
metabolism, loss of expression of cell surface molecules
and resistance to cytotoxic drugs (see, for example,
Malmstrom and Krieger, J. Biol. Chem. 266:24025-24030
(1991); Yayon et al., Cell 64:841-848 (1991), which are
incorporated herein by reference). The approach disclosed
herein should allow isolation of cDNA sequences encoding
the proteins involved in these various cellular functions.

As used herein, the terms "purified" and
"isolated" mean that the molecule or compound is
substantially free of contaminants normally associated with
a native or natural environment. For example, a purified
protein can be obtained by a number of methods. The
naturally-occurring protein can be purified by any means
known in the art, including, for example, by affinity
purification with antibodies having specific reactivity
with the protein. In this regard, anti-core 2 β1→6 N-
acetylglucosaminyltransferase antibodies can be used to
substantially purify naturally-occurring core 2 β1→6 N-
acetylglucosaminyltransferase from human HL-60
promyelocytes.

Alternatively, a purified protein of the present
invention can be obtained by well known recombinant
methods, utilizing the nucleic acids disclosed herein, as
described, for example, in Sambrook et al., Molecular
Cloning: A Laboratory Manual 2d ed. (Cold Spring Harbor
Laboratory 1989), which is incorporated herein by
reference, and by the methods described in the Examples
below. Furthermore, purified proteins can be synthesized
by methods well known in the art.

As used herein, the phrase "substantially the
sequence" includes the described nucleotide or amino acid
sequence and sequences having one or more additions,
deletions or substitutions that do not substantially affect
the ability of the sequence to encode a protein having a
desired functional activity. In addition, the phrase
encompasses any additional sequence that hybridizes to the
disclosed sequence under stringent hybridization conditions.
Methods of hybridization are well known to those skilled in
the art. For example, sequence modifications that do not
substantially alter such activity are intended. Thus, a
protein having substantially the amino acid sequence of
Figure 5 (SEQ. ID. NO. 5) refers to core 2 β1→6 N-
acetylglucosaminyltransferase encoded by the cDNA described
in Example IV, as well as proteins having amino acid
sequences that are modified but, nevertheless, retain the
functions of core 2 β1→6 N-acetylglucosaminyltransferase.
One skilled in the art can readily determine such retention
of function following the guidance set forth, for example,
in Examples V and VI.

The present invention is further directed to
active fragments of the human core 2 β1→6 N-
acetylglucosaminyltransferase protein. As used herein, an
active fragment refers to portions of the protein that
substantially retain the glycosyltransferase activity of
the intact core 2 β1→6 N-acetylglucosaminyltransferase
protein. One skilled in the art can readily identify
active fragments of proteins such as core 2 β1→6 N-
acetylglucosaminyltransferase by comparing the activities
of a selected fragment with the intact protein following
the guidance set forth in the Examples below.



As used herein, the term "glycosyltransferase
activity" refers to the function of a glycosyltransferase
to link sugar residues together through a glycosidic bond
to create critical branches in oligosaccharides.
Glycosyltransferase activity results in the specific
transfer of a monosaccharide to an appropriate acceptor
molecule, such that the acceptor molecule contains
oligosaccharides having critical branches. One skilled in
the art would understand the terms "enzymatic activity" and
"catalytic activity" to generally refer to a function of
certain proteins, such as the function of those proteins
having glycosyltransferase activity.



As used herein, the term "acceptor molecule"
refers to a molecule that is acted upon by a protein having
enzymatic activity. For example, an acceptor molecule,
such as leukosialin, as identified by the amino acid
sequence of Figure 2 (SEQ. ID. NO. 3), accepts the transfer
of a monosaccharide due to glycosyltransferase activity.
An acceptor molecule, such as leukosialin, may already
contain one or more sugar residues. The transfer of
monosaccharides to an acceptor molecule, such as
leukosialin, results in the formation of critical branches
of oligosaccharides.

As used herein, the term "critical branches"
refers to oligosaccharide structures formed by specific
glycosyltransferase activity. Critical branches may be
involved in various cellular functions, such as cell-cell
recognition. The oligosaccharide structure of a critical
branch can be determined using methods well known in the
art, such as the method for determining the core 2
oligosaccharide structure, as described in Examples V and
VI.



Relatedly, the invention also provides nucleic
acids encoding the human core 2 β1→6 N-
acetylglucosaminyltransferase protein and leukosialin
protein described above. The nucleic acids can be in the
form of DNA, RNA or cDNA, such as the novel C2GnT cDNA of
2105 base pairs identified in Figure 5 (SEQ. ID. NO. 4) or
the novel leukosialin cDNA identified in Figure 2 (SEQ. ID.
NO. 2), for example. Such nucleic acids can also be
chemically synthesized by methods known in the art,
including, for example, the use of an automated nucleic
acid synthesizer.

The nucleic acid can have substantially the
nucleotide sequence of C2GnT, identified in Figure 5 (SEQ.
ID. NO. 4), or leukosialin identified in Figure 2 (SEQ. ID.
NO. 2). Portions of such nucleic acids that encode active
fragments of the core 2 β1→6 N-
acetylglucosaminyltransferase protein or leukosialin
protein of the present invention also are contemplated.

Nucleic acid probes capable of hybridizing to the 
nucleic acids of the present invention under reasonably 
stringent conditions can be prepared from the cloned 
sequences or by synthesizing oligonucleotides by methods 
known in the art. The probes can be labeled with markers 
according to methods known in the art and used to detect 
the nucleic acids of the present invention. Such nucleic 
acids can be detected by contacting the probe with a sample 
containing or suspected of containing the nucleic acid 
under hybridizing conditions, and detecting the 
hybridization of the probe to the nucleic acid. 
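
As a rough illustration of the detection principle, the following Python sketch locates positions on a target strand where a probe could anneal by perfect Watson-Crick complementarity. It is a simplification under stated assumptions: hybridization under reasonably stringent conditions can tolerate some mismatches, which this sketch ignores, and the function names and example sequences are hypothetical rather than taken from the specification.

```python
# Hypothetical helper names; sequences are illustrative only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(target: str, probe: str) -> list[int]:
    """0-based start positions on `target` (5'->3') where `probe`
    can anneal with perfect complementarity."""
    site = reverse_complement(probe)  # the probe pairs with this stretch
    target = target.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return hits
```

For example, `probe_binding_sites("TTTACGGTCAAA", "GACCGT")` reports the single position at which the probe's reverse complement, ACGGTC, occurs in the target.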

The present invention is further directed to 
vectors containing the nucleic acids described above. The 
term "vector" includes vectors that are capable of 
expressing nucleic acid sequences operably linked to 
regulatory sequences capable of effecting their expression. 
Numerous cloning vectors are known in the art. Thus, the 
selection of an appropriate cloning vector is a matter of 
choice. In general, useful vectors for recombinant DNA are 
often plasmids, which refer to circular double stranded DNA 
loops such as pcDNAI or pcDSRα. As used herein, "plasmid" 
and "vector" may be used interchangeably as the plasmid is 
a common form of a vector. However, the invention is 
intended to include other forms of expression vectors that 
serve equivalent functions. 

Suitable host cells containing the vectors of the 
present invention are also provided. Host cells can be 
transformed with a vector and used to express the desired 
recombinant or fusion protein. Methods of recombinant 
expression in a variety of host cells, such as mammalian, 
yeast, insect or bacterial cells are widely known. For 
example, a nucleic acid encoding core 2 β1→6 N-
acetylglucosaminyltransferase or a nucleic acid encoding 
leukosialin can be transfected into cells using the calcium 
phosphate technique or other transfection methods, such as 
those described in Sambrook et al., supra, (1989). 

Alternatively, nucleic acids can be introduced 
into cells by infection with a retrovirus carrying the gene 
or genes of interest. For example, the gene can be cloned 
into a plasmid containing retroviral long terminal repeat 
sequences, the C2GnT DNA sequence or the leukosialin DNA 
sequence, and an antibiotic resistance gene for selection. 
The construct can then be transfected into a suitable cell 
line, such as PA12, which carries a packaging deficient 
provirus and expresses the necessary components for virus 
production, including synthesis of amphotropic 
glycoproteins. The supernatant from these cells contains 
infectious virus, which can be used to infect the cells of 
interest. 

Isolated recombinant polypeptides or proteins can 
be obtained by growing the described host cells under 
conditions that favor transcription and translation of the 
transfected nucleic acid. Recombinant proteins produced by 
the transfected host cells are isolated using methods set 
forth herein and by methods well known to those skilled in 
the art. 

Also provided are antibodies having specific 
reactivity with the core 2 β1→6 N-
acetylglucosaminyltransferase protein or leukosialin 
protein of the present invention. Active fragments of 
antibodies, for example, Fab and Fab'2 fragments, having 
specific reactivity with such proteins are intended to fall 
within the definition of an "antibody." Antibodies 
exhibiting a titer of at least about 1.5 x 10^, as 
determined by ELISA, are useful in the present invention. 

The antibodies of the invention can be produced 
by any method known in the art. For example, polyclonal 
and monoclonal antibodies can be produced by methods 
described in Harlow and Lane, Antibodies: A Laboratory 
Manual (Cold Spring Harbor 1988), which is incorporated 
herein by reference. The proteins, particularly core 2 
β1→6 N-acetylglucosaminyltransferase or leukosialin of the 
present invention, can be used as immunogens to generate 
such antibodies. Altered antibodies, such as chimeric, 
humanized, CDR-grafted or bifunctional antibodies can also 
be produced by methods well known to those skilled in the 
art. Such antibodies can also be produced by hybridoma, 
chemical synthesis or recombinant methods described, for 
example, in Sambrook et al., supra, (1989). 

The antibodies can be used for determining the 
presence of, or for purifying, the core 2 β1→6 N-
acetylglucosaminyltransferase protein or the leukosialin 
protein of the present invention. With respect to the 
detection of such proteins, the antibodies can be used for 
in vitro or in vivo methods well known to those skilled in 
the art. 

Finally, kits useful for carrying out the methods 
of the invention are also provided. The kits can contain 
a core 2 β1→6 N-acetylglucosaminyltransferase protein, 
antibody or nucleic acid of the present invention and an 
ancillary reagent. Alternatively, the kit can contain a 
leukosialin protein, antibody or nucleic acid of the 
present invention and an ancillary reagent. An ancillary 
reagent may include diagnostic agents, signal detection 
systems, buffers, stabilizers, pharmaceutically acceptable 
carriers or other reagents and materials conventionally 
included in such kits. 

A cDNA sequence encoding core 2 β1→6 N-
acetylglucosaminyltransferase was isolated and core 2 β1→6 
N-acetylglucosaminyltransferase activity was determined. 
This is the first report of transient expression cloning 
using CHO cells expressing polyoma large T antigen. The 
following examples are intended to illustrate but not limit 
the present invention. 
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EXAMPLE I 

EXPRESSION CLONING IN COS-1 CELLS OF THE cDNA FOR THE 

PROTEIN CARRYING THE HEXASACCHARIDES 

COS-1 cells were transfected with a cDNA library, 
pcDSRα-2F1, constructed from poly(A)+ RNA of activated T 
lymphocytes, which express the core 2 β1→6 N-
acetylglucosaminyltransferase (Yokota et al., Proc. Natl. 
Acad. Sci. USA 83:5894-5898 (1986); Piller et al., supra, 
(1988), which are incorporated herein by reference). COS-1 
cells support replication of the pcDSRα constructs, which 
contain the SV40 replication origin. Transfected cells 
were selected by panning using monoclonal antibody T305, 
which recognizes sialylated branched hexasaccharides 
(Piller et al., supra, (1991); Saitoh et al., supra, 
(1991)). Methods referred to in this example are described 
in greater detail in the examples that follow. 

Following several rounds of transfection, one 
plasmid, pcDSRα-leu, directing high expression of the T305 
antigen was identified. The cloned cDNA insert was 
isolated and sequenced, then compared with other reported 
sequences. The newly isolated cDNA sequence was nearly 
identical to the sequence reported for leukosialin, except 
the 5'-flanking sequences were different (Pallant et al., 
Proc. Natl. Acad. Sci. USA 86:1328-1332 (1989), which is 
incorporated herein by reference). 

Comparison of the cloned cDNA sequence with the 
genomic leukosialin DNA sequence revealed that the start 
site of the cDNA sequence is located 259 bp upstream of the 
transcription start site of the previously reported 
sequence (Figure 2; compare Exon 1' and Exon 1) (Shelley et 
al., Biochem. J. 270:569-576 (1990); Kudo and Fukuda, J. 
Biol. Chem. 266:8483-8489 (1991), which are incorporated 
herein by reference). A consensus splice site was 
identified at the exon-intron junction of the newly 
identified 122 bp exon 1' in pcDSRα-leu (Breathnach and 
Chambon, Ann. Rev. Biochem. 50:349-383 (1981), which is 
incorporated herein by reference). This splice site is 
followed by the exon 2 sequence. 

These results indicate the T305 antibody 
preferentially binds to branched hexasaccharides attached 
to leukosialin. Indeed, a small amount of the 
hexasaccharides (approximately 8% of the total) was 
detected in O-glycans isolated from control COS-1 cells. 
T305 binding is similar to that of anti-M and anti-N 
antibodies, which recognize both the glycan and polypeptide 
portions of the erythrocyte glycoprotein, glycophorin 
(Sadler et al., J. Biol. Chem. 254:2112-2119 (1979), which 
is incorporated herein by reference). These observations 
are consistent with reports that only leukosialin strongly 
reacted with T305 in Western blots of leukocyte cell 
extracts, even though leukocytes also express other 
glycoproteins, such as CD45, that must also contain the 
same hexasaccharides (Piller et al., supra, (1991); Saitoh 
et al., supra, (1991)). 






EXAMPLE II 

ESTABLISHMENT OF CHO CELL LINES THAT STABLY EXPRESS 
POLYOMA VIRUS LARGE T ANTIGEN AND LEUKOSIALIN 

T305 preferentially binds to branched 
hexasaccharides attached to leukosialin. Such 
hexasaccharides are not present on the erythropoietin 
glycoprotein produced in CHO cells, although the 
glycoprotein does contain the precursor tetrasaccharide 
(Sasaki et al., J. Biol. Chem. 262:12059-12076 (1987), 
which is incorporated herein by reference). T305 antigen 
also is not detectable in CHO cells transiently transfected 
with pcDSRα-leu. In order to screen for the presence of a 
cDNA clone expressing core 2 β1→6 N-
acetylglucosaminyltransferase activity, a CHO cell line 
expressing both leukosialin and polyoma large T antigen was 
established (see, for example, Heffernan and Dennis, Nucl. 
Acids Res. 19:85-92 (1991), which is incorporated herein by 
reference). 

Vectors: A plasmid vector, pPSVE1-PyE, which contains 
the polyoma virus early genes under the control of the SV40 
early promoter, was constructed using a modification of the 
method of Muller et al., Mol. Cell. Biol. 4:2406-2412 
(1984), which is incorporated herein by reference. Plasmid 
pPSVE1 was prepared using pPSG4 (American Type Culture 
Collection 37337) and SV40 viral DNA (Bethesda Research 
Laboratories) essentially as described by Featherstone et 
al., Nucl. Acids Res. 12:7235-7249 (1984), which is 
incorporated herein by reference. Following EcoRI and 
Hindi digestion of plasmid pPyLT-1 (American Type Culture 
Collection 41043), a DNA sequence containing the carboxy 
terminal coding region of polyoma virus large T antigen was 
isolated. The Hindi site was converted to an EcoRI site 
by blunt-end ligation of phosphorylated EcoRI linkers 
(Stratagene). Plasmid pPSVE1-PyE was generated by 
inserting the carboxy-terminal coding sequence for large T 
antigen into the unique EcoRI site of plasmid pPSVE1. 

Plasmid pZIPNEO-leu was constructed by 
introducing the EcoRI fragment of PEER-3 cDNA, which 
contains the complete coding sequence for human 
leukosialin, into the unique EcoRI site of plasmid pZIPNEO 
(Cepko et al., Cell 37:1053-1063 (1984), which is 
incorporated herein by reference). Plasmid structures were 
confirmed by restriction mapping and by sequencing the 
construction sites. pZIPNEO was kindly provided by Dr. 
Channing Der. 

Transfection: CHO DG44 cells were grown in 100 mm tissue 
culture plates. When the cells were 20% confluent, they 
were co-transfected with a 1:4 molar ratio of pZIPNEO-leu 
and pPSVE1-PyE using the calcium phosphate technique 
(Graham and van der Eb, Virology 52:456-467 (1973), which 
is incorporated herein by reference). Transfected cells 
were isolated and maintained in medium containing 400 µg/ml 
G418 (active drug). 

Leukosialin expression: The total pool of G418-
resistant transfectants was enriched for human leukosialin 
expressing cells by a one-step panning procedure using 
anti-leukosialin antibodies and goat anti-rabbit IgG coated 
panning dishes (Sigma) (Carlsson and Fukuda, J. Biol. Chem. 
261:12779-12786 (1986), which is incorporated herein by 
reference). Clonal cell lines were obtained by limiting 
dilution. Six clonal cell lines expressing human 
leukosialin on the cell surface were identified by indirect 
immunofluorescence and isolated for further studies 
(Williams and Fukuda, J. Cell Biol. 111:955-966 (1990), 
which is incorporated herein by reference). 

Polyoma virus-mediated replication: The ability of the 
six clonal cell lines to support polyoma virus large T 
antigen-mediated replication of plasmids was assessed by 
determining the methylation status of transfected plasmids 
containing a polyoma virus origin of replication (Muller et 
al., supra, 1984; Heffernan and Dennis, supra, 1991). 
Plasmid pGT/hCG contains a fused β1→4 galactosyltransferase 
and human chorionic gonadotropin α-chain DNA sequence 
inserted in plasmid pcDNAI, which contains a polyoma virus 
replication origin (Aoki et al., Proc. Natl. Acad. Sci. 
USA 89:4319-4323 (1992), which is incorporated herein by 
reference). 

Plasmid pGT/hCG was isolated from methylase-
positive E. coli strain MC1061/P3 (Invitrogen), which 
methylates the adenine residues in the DpnI recognition 
site, "GATC". The methylated DpnI recognition site is 
susceptible to cleavage by DpnI. In contrast, the DpnI 
recognition site of plasmids replicated in mammalian cells 
is not methylated and, therefore, is resistant to DpnI 
digestion. 
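
The logic of the replication assay can be sketched in Python as follows. The sketch treats the plasmid as a circular string, assumes the enzyme cuts in the middle of every methylated GATC site and leaves unmethylated DNA intact, and ignores hemimethylated intermediates. The function names and the 16-base example "plasmid" are hypothetical and serve only to illustrate the principle.

```python
def gatc_sites(plasmid: str) -> list[int]:
    """0-based start positions of GATC on a circular sequence."""
    seq = plasmid.upper()
    wrapped = seq + seq[:3]  # lets a site span the origin of the circle
    return [i for i in range(len(seq)) if wrapped.startswith("GATC", i)]


def dpni_fragment_lengths(plasmid: str, dam_methylated: bool) -> list[int]:
    """Fragment lengths after digestion of a circular plasmid.

    Methylated (bacterially grown) DNA is cut at every GATC site;
    unmethylated DNA (replicated in mammalian cells) is left uncut.
    """
    n = len(plasmid)
    sites = gatc_sites(plasmid) if dam_methylated else []
    if not sites:
        return [n]  # resistant: one intact molecule
    cuts = sorted((i + 2) % n for i in sites)  # blunt cut between A and T
    return [
        ((cuts[(k + 1) % len(cuts)] - cuts[k]) % n) or n
        for k in range(len(cuts))
    ]
```

With the toy sequence `"AAGATCTTTTGATCCC"`, input DNA from the methylating host is reduced to two fragments, whereas a copy replicated in mammalian cells remains a single full-length molecule, which is what the Southern blot distinguishes.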

Methylated plasmid pGT/hCG was transfected by 
lipofection into each of the six selected clonal cell lines 
expressing leukosialin. After 64 hr, low molecular weight 
plasmid DNA was isolated from the cells using the method of 
Hirt, J. Mol. Biol. 26:365-369 (1967), which is 
incorporated herein by reference. Isolated plasmid DNA was 
digested with XhoI and DpnI (Stratagene), subjected to 
electrophoresis in a 1% agarose gel, and transferred to 
nylon membranes (Micron Separations Inc., MA). 

A 0.4 kb SmaI fragment of the β1→4 
galactosyltransferase DNA sequence of pGT/hCG was 
radiolabeled with [32P]dCTP using the random primer method 
(Feinberg and Vogelstein, Anal. Biochem. 132:6-13 (1983), 
which is incorporated herein by reference). Hybridization 
was performed using methods well-known to those skilled in 
the art (see, for example, Sambrook et al., supra, (1989)). 
Following hybridization, the membranes were washed several 
times, including a final high stringency wash in 0.1 x 
SSPE, 0.1% SDS for 1 hr at 65°C, then exposed to Kodak X-AR 
film at -70°C. 

Four of the six clones tested supported 
replication of the pcDNAI-based plasmid, pGT/hCG (Fig. 
3.A., lanes 1, 3, 4 and 5). MOP-8 cells, a 3T3 cell line 
transformed by polyoma virus early genes (Muller et al., 
supra, (1984)), express endogenous core 2 β1→6 N-
acetylglucosaminyltransferase activity and were used as a 
control for the replication assay (Fig. 3.B., lane 1). One 
clonal cell line, CHO-Py-leu, that supported pGT/hCG 
replication (Fig. 3.A., lane 5; Fig. 3.B., lanes 2 and 3) 
and expressed a significant amount of leukosialin was 
selected for further studies. pGT/hCG was kindly provided 
by Dr. Michiko Fukuda. 



EXAMPLE III 

ISOLATION OF A cDNA SEQUENCE DIRECTING EXPRESSION OF THE 

HEXASACCHARIDE ON LEUKOSIALIN 

Poly(A)+ RNA was isolated from HL-60 
promyelocytes, which contain a significant amount of the 
core 2 β1→6 N-acetylglucosaminyltransferase (Saitoh et al., 
supra, (1991)). A cDNA expression library, pcDNAI-HL-60, 
was prepared (Invitrogen) and the library was screened for 
clones directing the expression of the T305 antigen. 

Plasmid DNA from the pcDNAI-HL-60 cDNA library 
was transfected into CHO-Py-leu cells using a modification 
of the lipofection procedure, described below (Felgner et 
al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987), which 
is incorporated herein by reference). CHO-Py-leu cells 
were grown in 100 mm tissue culture plates. When the cells 
were 20% confluent, they were washed twice with Opti-MEM I 
(GIBCO). Fifty µg of lipofectin reagent (Bethesda Research 
Laboratories) and 20 µg of purified plasmid DNA were each 
diluted to 1.5 ml with Opti-MEM I, then mixed and added to 
the cells. After incubation for 6 hr at 37°C, the medium 
was removed, 10 ml of complete medium was added and 
incubation was continued for 16 hr at 37°C. The medium was 
then replaced with 10 ml of fresh medium. 

Following a 64 hr period to allow transient 
expression of the transfected plasmids, the cells were 
detached in PBS/5 mM EDTA, pH 7.4, for 30 min at 37°C, 
pooled, centrifuged and resuspended in cold PBS/10 mM 
EDTA/5% fetal calf serum, pH 7.4, containing a 1:200 
dilution of ascites fluid containing T305 monoclonal 
antibody. The cells were incubated on ice for 1 hr, then 
washed in the same buffer and panned on dishes coated with 
goat anti-mouse IgG (Sigma) (Wysocki and Sato, Proc. Natl. 
Acad. Sci. USA 75:2844-2848 (1978); Seed and Aruffo, Proc. 
Natl. Acad. Sci. USA 84:3365-3369 (1987), which are 
incorporated herein by reference). T305 monoclonal 
antibody was kindly provided by Dr. R.I. Fox, Scripps 
Research Foundation, La Jolla, CA. 

Plasmid DNA was recovered from adherent cells by 
the method of Hirt, supra, (1967), treated with DpnI to 
eliminate plasmids that had not replicated in transfected 
cells, and transformed into E. coli strain MC1061/P3. 
Plasmid DNA was then recovered and subjected to a second 
round of screening. E. coli transformants containing 
plasmids recovered from this second enrichment were plated 
to yield 8 pools of approximately 500 colonies each. 
Replica plates were prepared using methods well-known to 
those skilled in the art (see, for example, Sambrook et 
al., supra, (1989)). 

The pooled plasmid DNA was prepared from replica 
plates and transfected into CHO-Py-leu cells. The 
transfectants were screened by panning. One plasmid pool 
was selected and subjected to three subsequent rounds of 
selection. One plasmid, pcDNAI-C2GnT, which directed the 
expression of the T305 antigen, was isolated. CHO-Py-leu 
cells transfected with pcDNAI-C2GnT express the antigen 
recognized by T305, whereas CHO-Py-leu cells transfected 
with pcDNAI are negative for T305 antigen (Fig. 4). These 
results show pcDNAI-C2GnT directs the expression of a new 
determinant on leukosialin that is recognized by T305 
monoclonal antibody. This determinant is the branched 
hexasaccharide sequence, 
NeuNAcα2→3Galβ1→3(NeuNAcα2→3Galβ1→4GlcNAcβ1→6)GalNAc. 
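
The pool-based enrichment described in this example follows the general pattern of sib selection: a positive pool is repeatedly subdivided and rescreened until a single clone remains. The Python sketch below estimates how many subdivision rounds such a scheme needs. The specification does not state how the selected pool was subdivided in each round, so the eight-way split used in the usage note is purely an assumption, and the function name is hypothetical.

```python
def rounds_to_single_clone(pool_size: int, subpools_per_round: int) -> int:
    """Number of subdivide-and-rescreen rounds needed to reduce a positive
    pool of `pool_size` clones to one clone, assuming each round splits the
    current positive pool into `subpools_per_round` roughly equal sub-pools."""
    if pool_size < 1 or subpools_per_round < 2:
        raise ValueError("need pool_size >= 1 and subpools_per_round >= 2")
    rounds = 0
    while pool_size > 1:
        # ceiling division: size of the largest sub-pool after the split
        pool_size = -(-pool_size // subpools_per_round)
        rounds += 1
    return rounds
```

Under the assumed eight-way split, a pool of about 500 colonies shrinks to 63, then 8, then 1 clone, so `rounds_to_single_clone(500, 8)` gives three rounds; a two-way split of the same pool would take nine.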
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EXAMPLE IV 

CHARACTERIZATION OF C2GnT 

DNA sequence: The cDNA insert in plasmid pcDNAI-C2GnT 
was sequenced by the dideoxy chain termination method using 
Sequenase version 2 reagents (United States Biochemicals) 
(Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 
(1977), which is incorporated herein by reference). Both 
strands were sequenced using 17-mer synthetic 
oligonucleotides, which were synthesized as the sequence of 
the cDNA insert became known. 

Plasmid pcDNAI-C2GnT contains a 2105 base pair 
insert (Fig. 5). The cDNA sequence ends 1878 bp downstream 
of the putative translation start site. A polyadenylation 
signal is present at nucleotides 1694-1699. The 
significance of the large number of nucleotides between the 
polyadenylation signal and the beginning of the polyadenyl 
chain is not clear. However, this sequence is A/T rich. 

Deduced amino acid sequence: The cDNA insert in 
plasmid pcDNAI-C2GnT encodes a single open reading frame in 
the sense orientation with respect to the pcDNAI promoter 
(Fig. 5). The open reading frame encodes a putative 428 
amino acid protein having a molecular mass of 49,790 
daltons. 
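
The figures quoted for the insert and the open reading frame can be cross-checked with simple arithmetic, sketched below in Python. The only inputs are numbers stated in the text; the derived values are approximate, since the text does not say exactly how the 1878 bp distance is counted, and all variable names are hypothetical.

```python
# Values stated in the text
INSERT_BP = 2105               # length of the cDNA insert
DOWNSTREAM_OF_START_BP = 1878  # insert end relative to the translation start
RESIDUES = 428                 # length of the predicted protein
MASS_DA = 49_790               # predicted molecular mass in daltons

# Derived values (approximate; exact coordinates may differ by a base or two)
coding_nt = RESIDUES * 3 + 3   # codons plus one stop codon
upstream_bp = INSERT_BP - DOWNSTREAM_OF_START_BP
downstream_noncoding_bp = DOWNSTREAM_OF_START_BP - coding_nt
mean_residue_mass = MASS_DA / RESIDUES

print(f"coding region: {coding_nt} nt")
print(f"sequence upstream of the start site: about {upstream_bp} bp")
print(f"non-coding sequence after the stop codon: about {downstream_noncoding_bp} bp")
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

The mean residue mass that results, a little over 116 Da, is in the range expected for an ordinary polypeptide, so the stated length and mass are mutually consistent.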






Hydropathy analysis indicates the predicted 
protein is a type II transmembrane molecule, as are all 
previously reported mammalian glycosyltransferases 
(Schachter, supra, (1991)). In this topology, a nine amino 
acid cytoplasmic NH2-terminal segment is followed by a 23 
amino acid transmembrane domain flanked by basic amino acid 
residues. The large COOH-terminus consists of the stem and 
catalytic domains and presumably faces the lumen of the 
Golgi complex. 

The putative protein contains three potential N-
glycosylation sites (Fig. 5, asterisks). However, one of 
these sites contains a proline residue adjacent to 
asparagine and is not likely utilized in vivo. 
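
Potential N-glycosylation sites of this kind are conventionally located by scanning for the sequon Asn-X-Ser/Thr. The Python sketch below lists every such tripeptide and flags those in which X is proline, on the assumption that this is the arrangement the text refers to as a proline adjacent to asparagine. The peptide in the usage note is invented for illustration and is not the C2GnT sequence; the function name is hypothetical.

```python
import re

# Lookahead so that overlapping sequons are all reported
SEQUON = re.compile(r"(?=(N[A-Z][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str, bool]]:
    """Return (1-based Asn position, tripeptide, likely_used) for every
    Asn-X-Ser/Thr sequon. `likely_used` is False when X is proline."""
    hits = []
    for match in SEQUON.finditer(protein.upper()):
        tripeptide = match.group(1)
        hits.append((match.start() + 1, tripeptide, tripeptide[1] != "P"))
    return hits
```

For the made-up peptide `"MNGSANPTQNKTA"`, the scan reports three candidate sites at positions 2, 6 and 10, of which the middle one (NPT) is flagged as unlikely to be used.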

No matches were obtained when the C2GnT cDNA 
sequence and deduced amino acid sequence were compared with 
sequences listed in the PC/Gene 6.6 data bank. In 
particular, no homology was revealed between the deduced 
amino acid sequence of C2GnT and other 
glycosyltransferases, including N-
acetylglucosaminyltransferase I (Sarkar et al., Proc. Natl. 
Acad. Sci. USA 88:234-238 (1991), which is incorporated 
herein by reference). 

mRNA expression: Poly(A)+ RNA was prepared using a kit 
(Stratagene) and resolved by electrophoresis on a 1.2% 
agarose/2.2 M formaldehyde gel, and transferred to nylon 
membranes (Micron Separations Inc., MA) using methods well-
known to those skilled in the art (see, for example, 
Sambrook et al., supra, (1989)). Membranes were probed 
using the EcoRI insert of pPROTA-C2GnT (see below) 
radiolabeled with [32P]dCTP by the random priming method 
(Feinberg and Vogelstein, supra, (1983)). Hybridization was 
performed in buffers containing 50% formamide for 24 hr at 
42°C (Sambrook et al., supra, (1989)). Following 
hybridization, filters were washed several times in 
1 x SSPE/0.1% SDS at room temperature and once in 
0.1 x SSPE/0.1% SDS at 42°C, then exposed to Kodak X-AR film 
at -70°C. 

Fig. 6 compares the level of core 2 β1→6 N-
acetylglucosaminyltransferase mRNA isolated from HL-60 
promyelocytes, K562 erythroleukemia cells, and poorly 
metastatic SP and highly metastatic L4 colonic carcinoma 
cells. The major RNA species migrates at a size 
essentially identical to the ~2.1 kb C2GnT cDNA sequence. 
The same result is observed for HL-60 cells and the two 
colonic cell lines, which apparently synthesize the 
hexasaccharides. In addition, two transcripts of ~3.3 kb 
and 5.4 kb in size were detected in these cell lines. The 
two larger transcripts may result from differential usage 
of polyadenylation signals. 

No hybridization occurred with poly(A)+ RNA 
isolated from K562 cells, which lack the hexasaccharide, 
but synthesize the tetrasaccharide (Carlsson et al., supra, 
(1986), which is incorporated herein by reference). 
Similarly, no hybridization was observed for poly(A)+ RNA 
isolated from CHO-Py-leu cells (Fig. 6, lane 1). 

EXAMPLE V 



EXPRESSION OF ENZYMATICALLY ACTIVE β1→6 N-
ACETYLGLUCOSAMINYLTRANSFERASE 

In order to confirm that the C2GnT cDNA encodes 
core 2 β1→6 N-acetylglucosaminyltransferase, enzymatic 
activity was examined in CHO-Py-leu cells transfected with 
pcDNAI or pcDNAI-C2GnT. Following a 64 hr period to allow 
transient expression, cell lysates were prepared and core 
2 β1→6 N-acetylglucosaminyltransferase activity was 
measured. 

N-acetylglucosaminyltransferase assays were 
performed essentially as described by Saitoh et al., supra, 
(1991), Yousefi et al., supra, (1991), and Lee et al., J. 
Biol. Chem. 265:20476-20487 (1990), which is incorporated 
herein by reference. Each reaction contained 50 mM MES, 
pH 7.0, 0.5 µCi of UDP-[3H]GlcNAc in 1 mM UDP-GlcNAc, 0.1 M 
GlcNAc, 10 mM Na2EDTA, 1 mM of acceptor and 25 µl of either 
cell lysate, cell supernatant or IgG-Sepharose matrix in a 
total reaction volume of 50 µl. 
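
For orientation, the absolute amounts implied by this recipe can be worked out as in the Python sketch below. It assumes that the stated concentrations are final concentrations in the 50 µl reaction, which the text does not say explicitly, and it uses the identity 1 mM = 1 nmol/µl. The dictionary keys and function name are hypothetical labels.

```python
REACTION_VOLUME_UL = 50.0
TRACER_UCI = 0.5  # UDP-[3H]GlcNAc added per reaction

# Assumed to be FINAL concentrations in the reaction, in mM
FINAL_CONC_MM = {
    "MES": 50.0,
    "UDP-GlcNAc": 1.0,
    "GlcNAc": 100.0,   # 0.1 M
    "Na2EDTA": 10.0,
    "acceptor": 1.0,
}


def nmol_in_reaction(conc_mm: float, volume_ul: float) -> float:
    """Amount in nmol, using 1 mM == 1 nmol per microliter."""
    return conc_mm * volume_ul


amounts_nmol = {
    name: nmol_in_reaction(conc, REACTION_VOLUME_UL)
    for name, conc in FINAL_CONC_MM.items()
}

# Specific activity of the donor pool, in microcuries per nmol of UDP-GlcNAc
specific_activity = TRACER_UCI / amounts_nmol["UDP-GlcNAc"]

for name, nmol in amounts_nmol.items():
    print(f"{name}: {nmol:g} nmol")
print(f"donor specific activity: {specific_activity:g} uCi/nmol")
```

Under that assumption each reaction holds 50 nmol of donor and 50 nmol of acceptor. Converting measured counts into pmol of GlcNAc transferred would additionally require the counting efficiency, which is not given here.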



Reactions were incubated for 1 hr at 37°C, then 
processed by C18 Sep-Pak chromatography (Waters) (Palcic et 
al., J. Biol. Chem. 265:6759-6769 (1990), which is 
incorporated herein by reference). Core 2 and core 4 β1→6 
N-acetylglucosaminyltransferase were assayed using the 
acceptors p-nitrophenyl Galβ1→3GalNAc and p-nitrophenyl 
GlcNAcβ1→3GalNAc, respectively (Toronto Research 
Chemicals). 

UDP-GlcNAc:α-Man β1→6 N-
acetylglucosaminyltransferase (V) was assayed using the 
acceptor GlcNAcβ1→2Manα1→6Glc-β-O-(CH2)7CH3. The blood group 
I enzyme, UDP-GlcNAc:GlcNAcβ1→3Galβ1→4GlcNAc (GlcNAc to 
Gal) β1→6 N-acetylglucosaminyltransferase, was assayed 
using GlcNAcβ1→3Galβ1→4GlcNAcβ1→6Manα1→6Manβ1→O-(CH2)8COOCH3 
or Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→O-
(CH2)7CH3 as acceptors (Gu et al., J. Biol. Chem. 267:2994-
2999 (1992), which is incorporated herein by reference). 
Synthetic acceptors were kindly provided by Dr. Ole 
Hindsgaul, University of Alberta, Canada. 

Results of these assays are shown in Table I. 

Assuming transfection efficiency of the cells is 
approximately 20-30%, the level of enzymatic activity 
directed by cells transfected with pcDNAI-C2GnT is roughly 
equivalent to the level observed in HL-60 cells. 



TABLE I 

Core 2 β1→6 N-acetylglucosaminyltransferase activity in 
CHO-Py-leu cell extracts transfected with pcDNAI or 
pcDNAI-C2GnT 

Vector          Core 2 β1→6 GlcNAc transferase 
                activity (pmol/mg of protein/hr) 

pcDNAI          n.d. 
pcDNAI-C2GnT    764 

CHO-Py-leu cells were transfected with pcDNAI or pcDNAI-
C2GnT as described in the specification. Endogenous 
activity was measured in the absence of acceptor and 
subtracted from values determined in the presence of added 
acceptor. Galβ1→3GalNAcα-p-nitrophenyl was used as an 
acceptor. n.d. = not detectable. For comparison, the core 
2 β1→6 N-acetylglucosaminyltransferase activity measured in 
HL-60 cells under identical conditions was 3228 pmol/mg of 
protein per hr. 
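
The comparison with HL-60 cells can be made explicit with a small calculation, sketched below in Python. It assumes that all of the measured activity comes from the transfected fraction of the culture and that activity scales linearly with that fraction; both are simplifications. The function and variable names are hypothetical; the numbers are those of Table I and the 20-30% efficiency estimate given in the text.

```python
def activity_in_transfected_cells(measured: float, efficiency: float) -> float:
    """Scale a whole-population activity to the transfected fraction."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return measured / efficiency


C2GNT_TRANSFECTANTS = 764.0   # pmol/mg of protein/hr (Table I)
HL60_REFERENCE = 3228.0       # pmol/mg of protein/hr (Table I footnote)

low = activity_in_transfected_cells(C2GNT_TRANSFECTANTS, 0.30)
high = activity_in_transfected_cells(C2GNT_TRANSFECTANTS, 0.20)

print(f"estimated activity in transfected cells: {low:.0f} to {high:.0f} pmol/mg/hr")
print(f"HL-60 reference: {HL60_REFERENCE:.0f} pmol/mg/hr")
```

The HL-60 value falls inside the estimated range, which is the sense in which the two activity levels are described as roughly equivalent.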

In order to unequivocally establish that the C2GnT 
cDNA sequence encodes core 2 β1→6 N-
acetylglucosaminyltransferase, plasmid pPROTA-C2GnT was 
constructed containing the DNA sequence encoding the 
putative catalytic domain of core 2 β1→6 N-
acetylglucosaminyltransferase fused in frame with the 
signal peptide and IgG binding domain of S. aureus protein 
A (Fig. 7). The putative catalytic domain is contained in 
a 1330 bp fragment of the C2GnT cDNA that encodes amino 
acid residues 38 to 428. Plasmid pPROTA was kindly 
provided by Dr. John B. Lowe. 

The polymerase chain reaction (PCR) was used to 
insert EcoRI recognition sites on either side of the 1330 
bp sequence in pcDNAI-C2GnT DNA. PCR was performed using 
the synthetic oligonucleotide primers 5'-
TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC-3' (SEQ. ID. NO. 6) and 
5'-TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA-3' (SEQ. ID. NO. 7) 
(each primer contains an EcoRI recognition site, GAATTC, 
near its 5' end). The EcoRI sites 
allowed direct, in-frame insertion of the fragment into the 
unique EcoRI site of plasmid pPROTA (Sanchez-Lopez et al., 
J. Biol. Chem. 263:11892-11899 (1988), which is 
incorporated herein by reference). 
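
The primer design can be checked directly, as in the following Python sketch, which locates the EcoRI recognition sequence GAATTC in each primer as printed above. The helper name and dictionary labels are hypothetical.

```python
ECORI_SITE = "GAATTC"

# Primer sequences as given in the text (5'->3')
PRIMERS = {
    "SEQ ID NO 6": "TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC",
    "SEQ ID NO 7": "TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA",
}


def ecori_positions(primer: str) -> list[int]:
    """0-based start positions of GAATTC within a primer."""
    width = len(ECORI_SITE)
    return [
        i for i in range(len(primer) - width + 1)
        if primer[i:i + width] == ECORI_SITE
    ]


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt, EcoRI site(s) at", ecori_positions(seq))
```

Each primer carries exactly one site, starting at the fourth base, so three extra bases precede the recognition sequence at the 5' end and the remainder of the primer anneals to the C2GnT template.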

The nucleotide sequence of the insert as well as 
the proper orientation were confirmed by DNA sequencing 
using the primers described above for cDNA sequencing. 
Plasmid pPROTA-C2GnT allows secretion of the fusion protein 
from transfected cells and binding of the secreted fusion 
protein by insolubilized immunoglobulins. 

Either pPROTA or pPROTA-C2GnT was transfected 
into COS-1 cells. Following a 64 hr period to allow 
transient expression, cell supernatants were collected 
(Kukowska-Latallo et al., supra, (1990)). Cell 
supernatants were cleared by centrifugation, adjusted to 
0.05% Tween 20 and either assayed directly for core 2 β1→6 
N-acetylglucosaminyltransferase activity or used in IgG-
Sepharose (Pharmacia) binding studies. For the latter 
assay, supernatants (10 ml) were incubated batchwise with 
approximately 300 µl of IgG-Sepharose for 4 hr at 4°C. The 
matrices were then extensively washed and used directly for 
glycosyltransferase assays. 

No core 2 β1→6 N-acetylglucosaminyltransferase 
activity was detected in the medium of COS-1 cells 
transfected with the control plasmid, pPROTA. Similarly, 
no enzymatic activity was associated with IgG-Sepharose 
beads. In contrast, a significant level of core 2 β1→6 N-
acetylglucosaminyltransferase activity was detected in the 
medium of COS-1 cells transfected with pPROTA-C2GnT. The 
activity also associated with the IgG-Sepharose beads 
(Table II). No activity was detected in the supernatant 
following incubation of the supernatant with IgG-Sepharose. 

TABLE II 

Determination of Enzymatic Activities Directed by 
pPROTA-C2GnT. 

Acceptors and                             Radioactivity (cpm) 
linkages formed                           with (+) and without 
                                          (-) acceptor 
                                          (-)       (+) 

Galβ1→3(GlcNAcβ1→6)GalNAc                 109       1048 
(core 2-GnT) 

GlcNAcβ1→3(GlcNAcβ1→6)GalNAc              111       113 
(core 4-GnT) 

GlcNAcβ1→2(GlcNAcβ1→6)Man                 118       115 
(GnTV) 

GlcNAcβ1→3(GlcNAcβ1→6)Gal                 111       113 
(I-GnT) 

Galβ1→4GlcNAcβ1→3(GlcNAcβ1→6)Gal          99        99 
(I-GnT) 

COS-1 cells were transfected with pPROTA-C2GnT and the 
supernatant was incubated with IgG-Sepharose. The 
proteins bound to the IgG-Sepharose were assayed for β1→6 
N-acetylglucosaminyltransferase activity by using 
appropriate acceptors. The linkages formed are indicated. 
Similar results were obtained in three 
independent experiments. 
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EXAMPLE VI 

DETERMINATION OF C2GnT SPECIFICITY 

Four types of β1→6 N-
acetylglucosaminyltransferase linkages have been reported, 
including core 2 and core 4 in O-glycans, I-antigen and a 
branch attached to mannose that forms tetraantennary 
glycans (see Table II). In order to determine whether 
these different structures are also synthesized by the 
enzyme encoded by the cloned C2GnT cDNA sequence, enzymatic 
activity was determined using five different acceptors. 

As shown in Table II, the fusion protein was only 
active with the acceptor for core 2 formation. The same 
was true when the formation of β1→6 N-acetylglucosaminyl 
linkage to internal galactose residues was examined (Table 
II, see structure at bottom). This result precludes the 
likelihood that the enzyme encoded by the C2GnT cDNA 
sequence may add N-acetylglucosamine to a non-reducing 
terminal galactose. The HL-60 core 2 β1→6 N-
acetylglucosaminyltransferase is exclusively responsible 
for the formation of the GlcNAcβ1→6 branch on 
Galβ1→3GalNAc. 

Although the invention has been described with 
reference to the disclosed embodiments, it should be 
understood that various modifications can be made without 
departing from the spirit of the invention. Accordingly, 
the invention is limited only by the following claims. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: LA JOLLA CANCER RESEARCH FOUNDATION 



(ii) TITLE OF INVENTION: A NOVEL BETA1-6 

N-ACETYLGLUCOSAMINYLTRANSFERASE, ITS ACCEPTOR MOLECULE, 
LEUKOSIALIN AND A METHOD FOR CLONING PROTEINS HAVING 
ENZYMATIC ACTIVITY 

(iii) NUMBER OF SEQUENCES: 8 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: CAMPBELL AND FLORES 

(B) STREET: 4370 LA JOLLA VILLAGE DRIVE, SUITE 700 

(C) CITY: SAN DIEGO 

(D) STATE: CALIFORNIA 

(E) COUNTRY: USA 

(F) ZIP: 92122 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 30 September 1993 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: KONSKI, ANTOINETTE F. 

(B) REGISTRATION NUMBER: 34,202 

(C) REFERENCE/DOCKET NUMBER: FP-LJ 9756

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619-535-9001 

(B) TELEFAX: 619-535-8949 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 841..900

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 91..192

(D) OTHER INFORMATION: /note= "EXON 1' IS LOCATED IN BOTH
GENOMIC AND cDNA. IN THE cDNA EXON 1' IS
IMMEDIATELY FOLLOWED BY EXON 2."

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 359..428

(D) OTHER INFORMATION: /note= "EXON 1 IS LOCATED IN
GENOMIC DNA"

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 193..806

(D) OTHER INFORMATION: /note= "THIS SEGMENT OF NUCLEIC
ACID CONSTITUTES INTRON SEQUENCE OF THE cDNA"

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 807..900

(D) OTHER INFORMATION: /note= "EXON 2 IS LOCATED IN BOTH
GENOMIC AND cDNA. IN THE cDNA EXON 2 IMMEDIATELY
FOLLOWS EXON 1'."
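
The feature coordinates given for SEQ ID NO:1 can be cross-checked against the sequence description that follows them. The Python sketch below is an illustrative check only and is not part of the original listing. It recomputes the feature lengths and reads the two bases at each end of the annotated intron (193..806) from the sequence rows that end at positions 240 and 840. The names `ROWS`, `FEATURES`, `feature_length` and `bases` are hypothetical helpers introduced for this example.

```python
# Two rows copied from the SEQ ID NO:1 sequence description, keyed by the
# 1-based position of their first base (spaces between blocks removed).
ROWS = {
    181: "TGCCTCCCTCAGGTGAGCCCCAGACCCCCAGGCACCCCGCTGGCCCCTGAAGGAGCAGGT",
    781: "CTGCTTCTCCGGCTGCCCACCTGCAGGTCCCAGCTCTTGCTCCTGCCTGTTTGCCTGGAA",
}

# Feature table entries as (start, end), 1-based and inclusive.
FEATURES = {
    "exon 1'": (91, 192),
    "exon 1": (359, 428),
    "intron": (193, 806),
    "exon 2": (807, 900),
    "CDS": (841, 900),
}


def feature_length(name):
    """Length in bases of a feature with inclusive coordinates."""
    start, end = FEATURES[name]
    return end - start + 1


def bases(start, end):
    """Return bases start..end (1-based, inclusive) from one excerpted row."""
    for row_start, row in ROWS.items():
        if row_start <= start and end < row_start + len(row):
            return row[start - row_start:end - row_start + 1]
    raise KeyError("positions not covered by the excerpted rows")


intron_start, intron_end = FEATURES["intron"]
donor = bases(intron_start, intron_start + 1)    # first two intron bases
acceptor = bases(intron_end - 1, intron_end)     # last two intron bases

for name in FEATURES:
    print(name, feature_length(name))
print("intron starts with", donor, "and ends with", acceptor)
```

Under these assumptions the annotated intron begins with GT and ends with AG, the dinucleotides usually found at splice donor and acceptor sites, and it is flanked directly by exon 1' (ending at 192) and exon 2 (starting at 807).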

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TTGGGGACCA CAAATGCAAA GGAAACCACC CTCCCCTCCC ACCTCCTCCT CTGCACCCTT 60 

GAGTTCTCAG GCTCACATTC CCACCACCCA CCTCTGAGCC CAGCCCTCCC TAGCATCACC 120 

ACTTCCATCC CATTCCTCAG CCAAGAGCCA GGAATCCTGA TTCCAGATCC CACGCTTCCC 180 

TGCCTCCCTC AGGTGAGCCC CAGACCCCCA GGCACCCCGC TGGCCCCTGA AGGAGCAGGT 240 

GATGGTGCTG TCTTCGCCCA GCAGCTGTGG GAGCAGGCGG GTGGGGCAGG ATGGAGGGGT 300 

GGGTGGGGTG GGTGGAGCCA GGGCCCACTT CCTTTCCCCT TGGGGCCCTG TCCTTCCCAG 360 

TCTTGCCCCA GCCTCGGGAG GTGGTGGAGT GACCTGGCCC CAGTGCTGCG TCCTTATCAG 420 

CCGAGCCGGT AAGAGGGTGA GACTTGGTGG GGTAGGGGCC TCAGTGGGCC TGGGAATGTG 480 

CCTGTGGCTT GAAAAGACTC TGACAGGTTA TGATGGGAAG AGATTGGGAG CCATTGGGCT 540 

GCACAGGGTC AGGGAAGGCC AGGAGGGGCT GGTCACTGCT GGAATCTAAG CTGCTGAGGC 600

TGGAGGGAGC CTCAGGATGG GGCTGATGGG GGAGCTGCCA GCATCTGTTC CTCTGTCATT 660 

TCTGATAACA GTAAAAGCCA GCATGGAAAA AACCGTTAAA CCGCAGGTTG GGCCTGGCCG 720 

TTGGCAGGGA AGTGGGCAGA GGGGAGGCCC GGCCAGGTCC TCCGGCAACT CCCGCGTGTT 780

CTGCTTCTCC GGCTGCCCAC CTGCAGGTCC CAGCTCTTGC TCCTGCCTGT TTGCCTGGAA 840 

ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC 888
Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

GCT CTG GGG AGC 900
Ala Leu Gly Ser
20
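
The correspondence between the CDS feature of SEQ ID NO:1 (841..900) and the 20 residues of SEQ ID NO:2 can be confirmed by translating the 60 bases. The sketch below assumes the standard genetic code and is offered as an illustration, not as part of the original listing; `translate` and `to_three_letter` are hypothetical helper names.

```python
# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# One-letter to three-letter residue codes ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

# Bases 841..900 of SEQ ID NO:1, copied from the listing.
CDS = ("ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC "
       "GCT CTG GGG AGC").replace(" ", "")

# Residues of SEQ ID NO:2, copied from the listing.
SEQ_ID_2 = ("Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp "
            "Ala Leu Gly Ser")


def translate(dna):
    """Translate complete codons from the first base; ignore a partial tail."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein):
    """Convert a one-letter protein string to spaced three-letter codes."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


protein = translate(CDS)
print(len(CDS), "bases ->", len(protein), "residues")
print(to_three_letter(protein) == SEQ_ID_2)
```

The same two helpers could be applied to the coding rows of SEQ ID NO:3 to compare them with SEQ ID NO:4.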
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

Ala Leu Gly Ser
20



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2105 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 220..1504

(ix) FEATURE:

(A) NAME/KEY: polyA_signal

(B) LOCATION: 1913..1918

(ix) FEATURE:

(A) NAME/KEY: misc_signal

(B) LOCATION: 248..314

(D) OTHER INFORMATION: /standard_name=
"SIGNAL/MEMBRANE-ANCHORING DOMAIN"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTGAAGTGCT CAGAATGGGG CAGGATGTCA CCTGGAATCA GCACTAAGTG ATTCAGACTT 60 

TCCTTACTTT TAAATGTGCT GCTCTTCATT TCAAGATGCC GTTGCAGCTC TGATAAATGC 120

AAACTGACAA CCTTCAAGGC CACGACGGAG GGAAAATCAT TGGTGCTTGG AGCATAGAAG 180 

ACTGCCCTTC ACAAAGGAAA TCCCTGATTA TTGTTTGAA ATG CTG AGG ACG TTG 234 

Met Leu Arg Thr Leu 
1 5 

CTG CGA AGG AGA CTT TTT TCT TAT CCC ACC AAA TAC TAC TTT ATG GTT 282 
Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys Tyr Tyr Phe Met Val
10 15 20 

CTT GTT TTA TCC CTA ATC ACC TTC TCC GTT TTA AGG ATT CAT CAA AAG 330 
Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu Arg Ile His Gln Lys
25 30 35 



CCT GAA TTT GTA AGT GTC AGA CAC TTG GAG CTT GCT GGG GAG AAT CCT 378 
Pro Glu Phe Val Ser Val Arg His Leu Glu Leu Ala Gly Glu Asn Pro
40 45 50 
AGT AGT GAT ATT AAT TGC ACC AAA GTT TTA CAG GGT GAT GTA AAT GAA 426
Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln Gly Asp Val Asn Glu
55 60 65 

ATC CAA AAG GTA AAG CTT GAG ATC CTA ACA GTG AAA TTT AAA AAG CGC 474 
Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val Lys Phe Lys Lys Arg
70 75 80 85 

CCT CGG TGG ACA CCT GAC GAC TAT ATA AAC ATG ACC AGT GAC TGT TCT 522 
Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met Thr Ser Asp Cys Ser
90 95 100 

TCT TTC ATC AAG AGA CGC AAA TAT ATT GTA GAA CCC CTT AGT AAA GAA 570 
Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu Pro Leu Ser Lys Glu
105 110 115

GAG GCG GAG TTT CCA ATA GCA TAT TCT ATA GTG GTT CAT CAC AAG ATT 618 
Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val Val His His Lys Ile
120 125 130 

GAA ATG CTT GAC AGG CTG CTG AGG GCC ATC TAT ATG CCT CAG AAT TTC 666 
Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr Met Pro Gln Asn Phe
135 140 145 

TAT TGC GTT CAT GTG GAC ACA AAA TCC GAG GAT TCC TAT TTA GCT GCA 714 
Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp Ser Tyr Leu Ala Ala 
150 155 160 165 

GTG ATG GGC ATC GCT TCC TGT TTT AGT AAT GTC TTT GTG GCC AGC CGA 762 
Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val Phe Val Ala Ser Arg
170 175 180 

TTG GAG AGT GTG GTT TAT GCA TCG TGG AGC CGG GTT CAG GCT GAC CTC 810 
Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg Val Gln Ala Asp Leu
185 190 195 

AAC TGC ATG AAG GAT CTC TAT GCA ATG AGT GCA AAC TGG AAG TAC TTG 858
Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala Asn Trp Lys Tyr Leu
200 205 210 

ATA AAT CTT TGT GGT ATG GAT TTT CCC ATT AAA ACC AAC CTA GAA ATT 906
Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys Thr Asn Leu Glu Ile
215 220 225 

GTC AGG AAG CTC AAG TTG TTA ATG GGA GAA AAC AAC CTG GAA ACG GAG 954 
Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn Asn Leu Glu Thr Glu
230 235 240 245 

AGG ATG CCA TCC CAT AAA GAA GAA AGG TGG AAG AAG CGG TAT GAG GTC 1002
Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys Lys Arg Tyr Glu Val
250 255 260 

GTT AAT GGA AAG CTG ACA AAC ACA GGG ACT GTC AAA ATG CTT CCT CCA 1050 
Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val Lys Met Leu Pro Pro
265 270 275 

CTC GAA ACA CCT CTC TTT TCT GGC AGT GCC TAC TTC GTG GTC AGT AGG 1098 
Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr Phe Val Val Ser Arg 
280 285 290 

GAG TAT GTG GGG TAT GTA CTA CAG AAT GAA AAA ATC CAA AAG TTG ATG 1146 
Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys Ile Gln Lys Leu Met
295 300 305 

GAG TGG GCA CAA GAC ACA TAC AGC CCT GAT GAG TAT CTC TGG GCC ACC 1194
Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu Tyr Leu Trp Ala Thr
310 315 320 325 



ATC CAA AGG ATT CCT GAA GTC CCG GGC TCA CTC CCT GCC AGC CAT AAG 1242
Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu Pro Ala Ser His Lys
330 335 340 

TAT GAT CTA TCT GAC ATG CAA GCA GTT GCC AGG TTT GTC AAG TGG CAG 1290
Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg Phe Val Lys Trp Gln
345 350 355 

TAC TTT GAG GGT GAT GTT TCC AAG GGT GCT CCC TAC CCG CCC TGC GAT 1338
Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro Tyr Pro Pro Cys Asp
360 365 370

GGA GTC CAT GTG CGC TCA GTG TGC ATT TTC GGA GCT GGT GAC TTG AAC 1386 
Gly Val His Val Arg Ser Val Cys Ile Phe Gly Ala Gly Asp Leu Asn
375 380 385 

TGG ATG CTG CGC AAA CAC CAC TTG TTT GCC AAT AAG TTT GAC GTG GAT 1434 
Trp Met Leu Arg Lys His His Leu Phe Ala Asn Lys Phe Asp Val Asp 
390 395 400 405 

GTT GAC CTC TTT GCC ATC CAG TGT TTG GAT GAG CAT TTG AGA CAC AAA 1482 
Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu His Leu Arg His Lys
410 415 420 

GCT TTG GAG ACA TTA AAA CAC T GACCATTACG GGCAATTTTA TGAACAAGAA 1534 

Ala Leu Glu Thr Leu Lys His 
425 



GAAGGATACA CAAAACGTAC CTTATCTGTT TCCCCTTCCT TGTCAGCGTC GGGAAGATGG 1594

TATGAAGTCC TCTTTGGGGC AGGGACTCTA GTAGATCTTC TTGTCAGAGA AGCTGCATGG 1654

TTTCTGCAGA GCACAGTTAG CTAGAAAGGT GATAGCATTA AATGTTCATC TAGAGTTAAT 1714

AGTGGGAGGA GTAAAGGTAG CCTTGAGGCC AGAGCAGGTA GCAAGGCATT GTGGAAAGAG 1774

GGGACCAGGG TGGCTGGGGA AGAGGCCGAT GCATAAAGTC AGCCTGTTCC AAGTGCTCAG 1834

GGACTTAGCA AAATGAGAAG ATGTGACCTG TGCCAAAACT ATTTTGAGAA TTTTAAATGT 1894

GACCATTTTT CTGGTATGAA TAAACTTACA GCAACAIU^TA ATCAAAGATA CAATTAATCT 1954

GATATTATAT TTGTTGAAAT AGAAATTTGA TTGTACTATA AATGATTTTT GTAAATAATT 2014

TATATTCTGC TCTAATACTG TACTGTGTAG TGTGTCTCCG TATGTCATCT CAGGGAGCTT 2074

AAAATGGGCT TGATTTAACA TTGAAAAAAA A 2105
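
The CDS coordinates of SEQ ID NO:3 can be related to the 428-residue length of SEQ ID NO:4 by simple codon arithmetic. The Python sketch below is illustrative only. It takes the start position 220 and the printed end position 1504 from the feature table, and reads the bases after the last codon from the row of the listing that ends at position 1534. The variable names are hypothetical and introduced for this example.

```python
CDS_START = 220          # first base of the ATG, from the feature table
CDS_END_LISTED = 1504    # end of the CDS feature as printed in the listing
PROTEIN_LENGTH = 428     # length stated for SEQ ID NO:4

# Last base of the final sense codon, and the three positions after it.
last_coding_base = CDS_START + 3 * PROTEIN_LENGTH - 1
next_codon = (last_coding_base + 1, last_coding_base + 3)

# Row of SEQ ID NO:3 ending at position 1534, copied from the listing.
ROW_END = 1534
row = ("GCT TTG GAG ACA TTA AAA CAC T "
       "GACCATTACG GGCAATTTTA TGAACAAGAA").replace(" ", "")
row_start = ROW_END - len(row) + 1


def bases(start, end):
    """Return bases start..end (1-based, inclusive) from the excerpted row."""
    return row[start - row_start:end - row_start + 1]


stop = bases(*next_codon)
print("last sense-codon base:", last_coding_base)
print("bases", next_codon, "=", stop)
print("listed CDS end minus last sense-codon base:",
      CDS_END_LISTED - last_coding_base)
```

With these inputs the 428 codons occupy positions 220 to 1503 and are followed by TGA at 1504 to 1506, so the printed CDS end of 1504 coincides with the first base of the stop codon. Whether the original listing intended 1503, 1504 or 1506 cannot be determined from the text alone.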



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 428 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Leu Arg Thr Leu Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys
1 5 10 15

Tyr Tyr Phe Met Val Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu
20 25 30

Arg Ile His Gln Lys Pro Glu Phe Val Ser Val Arg His Leu Glu Leu
35 40 45

Ala Gly Glu Asn Pro Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln
50 55 60

Gly Asp Val Asn Glu Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val
65 70 75 80

Lys Phe Lys Lys Arg Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met
85 90 95

Thr Ser Asp Cys Ser Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu
100 105 110

Pro Leu Ser Lys Glu Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val
115 120 125

Val His His Lys Ile Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr
130 135 140

Met Pro Gln Asn Phe Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp
145 150 155 160

Ser Tyr Leu Ala Ala Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val
165 170 175

Phe Val Ala Ser Arg Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg
180 185 190

Val Gln Ala Asp Leu Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala
195 200 205

Asn Trp Lys Tyr Leu Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys
210 215 220

Thr Asn Leu Glu Ile Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn
225 230 235 240

Asn Leu Glu Thr Glu Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys
245 250 255

Lys Arg Tyr Glu Val Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val
260 265 270

Lys Met Leu Pro Pro Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr
275 280 285

Phe Val Val Ser Arg Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys
290 295 300

Ile Gln Lys Leu Met Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu
305 310 315 320

Tyr Leu Trp Ala Thr Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu
325 330 335

Pro Ala Ser His Lys Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg
340 345 350

Phe Val Lys Trp Gln Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro
355 360 365

Tyr Pro Pro Cys Asp Gly Val His Val Arg Ser Val Cys Ile Phe Gly
370 375 380

Ala Gly Asp Leu Asn Trp Met Leu Arg Lys His His Leu Phe Ala Asn
385 390 395 400

Lys Phe Asp Val Asp Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu
405 410 415

His Leu Arg His Lys Ala Leu Glu Thr Leu Lys His
420 425

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTTGAATTCC CCTGAATTTG TAAGTGTCAG ACAC 34 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA 33 
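
The relationship between the two oligonucleotides (SEQ ID NOS: 5 and 6) and the cDNA of SEQ ID NO:3 can be examined directly from the listed sequences. The Python sketch below is an illustration under stated assumptions: it treats the leading `TTT` plus `GAATTC` (the EcoRI recognition sequence) as a non-annealing 5' tail and the last 24 bases of each oligonucleotide as the annealing part. That reading is inferred from the sequences themselves and is not stated in this section. The cDNA excerpts and their start positions (331 and 1635) are copied from the SEQ ID NO:3 rows, and all helper names are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence

SEQ_ID_5 = "TTTGAATTCC CCTGAATTTG TAAGTGTCAG ACAC".replace(" ", "")
SEQ_ID_6 = "TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA".replace(" ", "")

# Excerpts of SEQ ID NO:3 with the 1-based position of their first base.
CDNA_CODING = (331, "CCT GAA TTT GTA AGT GTC AGA CAC TTG GAG".replace(" ", ""))
CDNA_3UTR = (1635, "TTGTCAGAGA AGCTGCATGG TTTCTGCAGA".replace(" ", ""))

ANNEAL_LENGTH = 24      # assumed length of the annealing part of each oligo


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate(query, excerpt):
    """Return the 1-based cDNA position of query in excerpt, or None."""
    start, seq = excerpt
    offset = seq.find(query)
    return None if offset < 0 else start + offset


forward_site = locate(SEQ_ID_5[-ANNEAL_LENGTH:], CDNA_CODING)
reverse_site = locate(reverse_complement(SEQ_ID_6[-ANNEAL_LENGTH:]), CDNA_3UTR)

print("EcoRI site in SEQ ID NO:5:", ECORI_SITE in SEQ_ID_5)
print("EcoRI site in SEQ ID NO:6:", ECORI_SITE in SEQ_ID_6)
print("SEQ ID NO:5 matches the sense strand from position", forward_site)
print("SEQ ID NO:6 matches the antisense strand from position", reverse_site)
```

On this reading, SEQ ID NO:5 matches the sense strand starting at the Pro codon at position 331, and SEQ ID NO:6 is complementary to a stretch of the 3' untranslated region starting at position 1638.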
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..15 

(D) OTHER INFORMATION: /note= "PROTEIN A - C2GNT FUSION
PROTEIN"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GGG AAT TCC CCT GAA 15 
Gly Asn Ser Pro Glu
1 5 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Asn Ser Pro Glu 
1 5 
CLAIMS

We claim: 

1. A purified human protein or an active
fragment thereof having β1→6
N-acetylglucosaminyltransferase activity.

2. The purified protein of claim 1, wherein
said activity is that of UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc
to GalNAc) β1→6 N-acetylglucosaminyltransferase.

3. The purified protein of claim 2, wherein
said protein has a relative molecular weight of about 50
kD.

4. An isolated nucleic acid encoding the human 
protein or active fragment thereof of claim 1. 

5. A vector containing the nucleic acid of
claim 4.

6. The vector of claim 5, wherein said vector 
is a plasmid. 

7. The vector of claim 5, wherein said vector
is pcDNAI-C2GnT.

8. A host cell containing the vector of claim
5.

9. A purified human protein or a fragment
thereof that is an acceptor molecule, said acceptor
molecule being acted upon by the protein of claim 2 having
activity which exclusively forms core 2 oligosaccharide
structures in O-glycans.

10. The acceptor molecule of claim 9, wherein
said acceptor molecule is leukosialin, CD43.

11. An isolated nucleic acid encoding the 
acceptor molecule of claim 9. 

12. A vector containing the nucleic acid of
claim 11.

13. The vector of claim 12, wherein said vector
is a plasmid. 

14. The vector of claim 12, wherein said vector
is pcDSRa-leu. 

15. A host cell containing the vector of claim
12.

16. A method of obtaining from a cell line,
which does not normally contain a protein having catalytic
activity or an acceptor molecule for said protein, a
nucleic acid encoding said protein having catalytic
activity comprising:

a. transfecting said cell line with a DNA
sequence encoding the acceptor molecule, wherein the
acceptor molecule is stably expressed in the cell line;

b. transfecting said cell line with a cDNA
library containing said nucleic acid in a vector, wherein
proteins encoded by the transfected cDNA are transiently
expressed;

c. screening the transfected cells for
expression of said protein having catalytic activity; and

d. isolating the nucleic acid encoding the
protein having catalytic activity.
17. The vector of claim 16, wherein said vector
replicates in the transfected cell line. 

18. The vector in claim 17, wherein said vector
is a plasmid. 

19. The vector of claim 16, wherein said vector 
contains a viral replication origin. 

20. The vector of claim 19, wherein said 
replication origin is the polyoma virus replication origin. 

21. The cell line of claim 16, wherein said cell 
line supports replication of a vector. 

22. The cell line of claim 16, wherein said cell 
line expresses polyoma virus large T antigen. 

23. The cell line of claim 16, wherein said cell 
line is the Chinese hamster ovary cell line. 

24. The cell line of claim 23, wherein said cell 
line is CHO-Py-leu. 

25. A method of isolating a polypeptide having
catalytic activity that forms core 2 oligosaccharide
structures in O-glycans, said method comprising growing the
host cell of claim 8 under conditions which favor
expression of a nucleic acid encoding said polypeptide, and
isolating said polypeptide so produced.